Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
نویسندگان
چکیده
Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend the sparse coding framework to learn interpretable spatio-temporal primitives. We formulated the problem as a tensor factorization problem with tensor group norm constraints over the primitives, diagonal constraints on the activations that provide interpretability as well as smoothness constraints that are inherent to human motion. We demonstrate the effectiveness of our approach to learn interpretable representations of human motion from motion capture data, and show that our approach outperforms recently developed matching pursuit and sparse coding algorithms.
منابع مشابه
A Unified Middle-Level Representation for Video
This paper presents a middle-level video representation named Video Primal Sketch (VPS), which integrates two regimes of models: i) sparse coding model using static or moving primitives to explicitly represent moving corners, lines, feature points, etc., ii) FRAME/MRF model reproducing feature statistics extracted from input video to implicitly represent textured motion, such as water and fire....
متن کاملOn spatio-temporal sparse coding: Analysis and an algorithm
Sparse coding is a common approach to learning local features for object recognition. Recently, there has been an increasing interest in learning features from spatio-temporal, binocular, or other multi-observation data, where the goal is to encode the relationship between images rather than the content of a single image. We discuss the role of multiplicative interactions and of squaring non-li...
متن کاملReading Report: Video Primal Sketch
In this report, we discuss a content-based video coding approach, i.e. video primal sketch. Unlike traditional video coding approaches, video primal sketch seeks to capture different mid-level cues and use them to explain the video data. The algorithm first segmented the video data into two different kinds of regions: explicit region and implicit region, and then model each region by a correspo...
متن کاملAction recognition using global spatio-temporal features derived from sparse representations
Recognizing actions is one of the important challenges in computer vision with respect to video data, with applications to surveillance, diagnostics of mental disorders, and video retrieval. Compared to other data modalities such as documents and images, processing video data demands orders of magnitude higher computational and storage resources. One way to alleviate this difficulty is to focus...
متن کاملExtraction of spatio-temporal primitives of emotional body expressions
Experimental and computational studies suggest that complex motor behavior is based on simpler spatiotemporal primitives, or synergies. This has been demonstrated by application of dimensionality reduction techniques to signals obtained by electrophysiological and EMG recordings during the execution of limb movements. However, the existence of spatio-temporal primitives on the level of the join...
متن کامل